A Tagging Approach to Identify Complex Constituents for Text Simplification
نویسندگان
چکیده
The occurrence of syntactic phenomena such as coordination and subordination is characteristic of long, complex sentences. Text simplification systems need to detect and categorise constituents in order to generate simpler sentences. These constituents are typically bounded or linked by signs of syntactic complexity, which include conjunctions, complementisers, whwords, and punctuation marks. This paper proposes a supervised tagging approach to classify these signs in accordance with their linking and bounding functions. The performance of the approach is evaluated both intrinsically, using an annotated corpus covering three different genres, and extrinsically, by evaluating the impact of classification errors on an automatic text simplification system. The results are
منابع مشابه
Relative clause extraction for syntactic simplification
This paper investigates non-destructive simplification, a type of syntactic text simplification which focuses on extracting embedded clauses from structurally complex sentences and rephrasing them without affecting their original meaning. This process reduces the average sentence length and complexity to make text simpler. Although relevant for human readers with low reading skills or language ...
متن کاملمعرفی رویکردی ماشینی با استفاده از الگوریتم لسک و برچسبدهی نحوی جهت رفع ابهام از معنای کلمات
The present study introduces a machine-based approach for word sense disambiguation (WSD). In Persian, a morphologically complex language, POS tag which lots of homographs are made, one way for doing WSD is allocating the right Part Of Speech (POS) tags to words prior to WSD. Since the frequency of noun and adjective homographs in different Persian POS tag text corpuses is high, POS tag disambi...
متن کاملExtraction of Drug-Drug Interaction from Literature through Detecting Linguistic-based Negation and Clause Dependency
Extracting biomedical relations such as drug-drug interaction (DDI) from text is an important task in biomedical NLP. Due to the large number of complex sentences in biomedical literature, researchers have employed some sentence simplification techniques to improve the performance of the relation extraction methods. However, due to difficulty of the task, there is no noteworthy improvement in t...
متن کاملQualitative and Quantitative Examination of Text Type Readabilities: A Comparative Analysis
This study compared 2 main approaches to readability assessment. Thequantitative approach applied idea density based on part of speech tagging andcompared 3 sets of text types (i.e., narrative, expository, and argumentative) withrespect to their ease of reading. The qualitative approach was done throughdeveloping questionnaires measuring intermediate EFL learners’ perceptions oncontent, motivat...
متن کاملKinetic Mechanism Reduction Using Genetic Algorithms, Case Study on H2/O2 Reaction
For large and complex reacting systems, computational efficiency becomes a critical issue in process simulation, optimization and model-based control. Mechanism simplification is often a necessity to improve computational speed. We present a novel approach to simplification of reaction networks that formulates the model reduction problem as an optimization problem and solves it using geneti...
متن کامل